Optimization of ground and excited state wavefunctions and van der Waals clusters 
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A quantum Monte Carlo method is introduced to optimize 
excited state trial wavefunctions. The method is applied in 
a correlation function Monte Carlo calculation to compute 
ground and excited state energies of bosonic van der Waals 
clusters of upto seven particles. The calculations are per- 
formed using trial wavefunctions with general three-body cor- 
relations. 

PACS codes: 03.65 02. 5 O.N, 02.70. L 36.40.M 34.30 

Weakly-bound clusters display strong anharmonicity, 
and this makes solving the Schrodinger equation for such 
systems a formidable computational challenge. The dis- 
crete variable representation method (DVR) has been 
applied with success to compute the energies of vibra- 
tional states for systems with upto six degrees of free- 
dom. Its computational complexity scales exponentially 
with dimension, a problem that Monte Carlo methods 
can avoid. Indeed, correlation function Monte Carlo [^]]|] 
and the projector operator imaginary time spectral evo- 
lution (POITSE) Q methods are applicable to higher 
dimensional systems, although in practice they are re- 
stricted to a smaller number of excited states. 

The accuracy of Monte Carlo projection methods can 
be improved dramatically by employing approximate 
eigenf unctions. In fact, without good initial approxima- 
tions one rarely obtains results of sufficient accuracy. In 
this Letter, we discuss a systematic and efficient method 
to construct approximate eigenfunctions by optimization 
of many-parameter trial functions. We then use these 
functions in a correlation function Monte Carlo calcula- 
tion. We expect that also POITSE calculations can be 
improved by the same means. A variant of the method 
described here was applied previously to study critical 
dynamics of lattice systems. (||| 

We compute energy levels of bosonic van der Waals 
clusters of atoms of mass /i, interacting via a Lennard- 
Jones potential with core radius a and well depth e. In 
dimensionless form, the pair potential is r~ 12 — 2r~ 6 and 
the Hamiltonian is H — P 2 /2m + V; P 2 /2m and V are 
the total kinetic and potential energy. The only param- 
eter is the inverse dimensionless mass to -1 = h 2 /fj,a 2 e, 
which is proportional to the square of the de Boer pa- 
rameter. [Q 

We use the position representation, and denote by R 
the Cartesian coordinates of a cluster of N c atoms. For 
the parameter optimization we generate a sample of con- 
figurations Ru, a = 1, . . . , E, sampled with relative prob- 
abilities i/) g (i? CT ) 2 ; the choice of the guiding function ip 2 
will be discussed later. The sample typically consists 
of several thousands configurations, which are kept fixed 



during the optimization. 

The trial functions are linear combinations of about 
a hundred elementary basis functions, each of which de- 
pends on non-linear parameters. Correspondingly, we 
have a linear optimization nested in a non-linear one. 
The result is a set of functions serving as basis functions 
in a correlation function Monte Carlo calculation. These 
basis functions are constructed one at a time, from the 
ground state up, as follows. 

Suppose we fix at initial values the non-linear pa- 
rameters of the elementary basis functions denoted by 
[3i, i = l,...,n. Ideally, these functions span an n- 
dimensional invariant subspace of the Hamiltonian 7i, 
and then there exists an n x n matrix £ so that 



Hf3 l {R a ) = Y,0 J {Ra)S Jl - 

3 

In that case, for k = 1, . . . , n, 



(i) 



(2) 



is an eigenvector of Ti with an eigenvalue equal to the 
exact energy Ek, if dS k ^ is a right eigenvector of £ with 
eigenvalue E^. We rewrite Eq. (Fy) in matrix form 



B' = BE, 



(3) 



where B al = and B' ai = Pi(R<,), with ft = &/<0 g 

and ft = Wi/ip s . 

In practice, the subspace spanned by the basis func- 
tions is not invariant, so that for given matrices B and 
B' Eq. (||) is an overdetermined set of equations for the 
unknown matrix £. If one multiplies through from the 
left by B T , the transpose of B, one obtains by inversion 



£ = {B T B)- 1 (B T B') = N^H. 



(4) 



As readily verified, this is the least-squares solution of 
Eq. (||); note that the rows of the matrices B and B' are 
weighted by the guiding function so that the elements 
of the matrices TV and H approach the standard quan- 
tum mechanical overlap integrals and matrix elements in 
the limit of an infinite Monte Carlo sample. Eqs. (||) 
and (||) are usually derived from stationarity (in the lin- 
ear parameters) of the average energy. If the latter is 
estimated by a finite-sample average, requiring station- 
arity of this estimate yields Eq. with H replaced by 
its symmetrized analog. Since the exact quantum me- 
chanical expression is indeed symmetrical, one might be 



inclined to use the symmetrized H. However, only the 
non-symmetric expression B T B' in Eq. (f|) satisfies the 
zero- variance principle of yielding exact results indepen- 
dent of the configuration sample if the basis functions 
span an invariant subspace of the Hamiltonian. As in 
the ideal case, Eq. (J2j> determines the linearly optimized 
trial functions, but now one has E^ ^ E^, an inequality 
H which for a finite Monte Carlo sample may be violated 
because of statistical noise. 

The solution for £ as written in Eq. (||) is numerically 
unstable since the matrix N is ill conditioned because of 
near-linear dependence of the pk- The solution to this 
problem |@ is to use a singular value decomposition 
to obtain a numerically regularized inverse B~ x fTlj ]. In 
terms of the latter, one finds from Eq. (0) 



£ = B^B' '. 



(5) 



With the linear variational parameters optimized for 
fixed non-linear parameters in the elementary basis func- 
tions, we optimize -following Umrigar et al. [^2|- the 
non-linear parameters by minimization of 



X 



(6) 



the variance of the local energy of the wavefunction given 
in Eq. (|^) ; again, the guiding function is incorporated via 
= V ()fe) Ms and fo k > = Ti^ k) /ip g . 

We now address the choice of the guiding function ip g . 
To obtain acceptable statistical errors, the sample has 
to have sufficient overlap with the desired excited states. 
In our case, this can be accomplished || with ^ g p = 
■0W with a power p in the range 2 < p < 3, while the 
ground state wavefunction is obtained after a few 
initial iterations. 

The elementary basis functions Jl3|,[l4| are the final in- 
gredient of the computation. They are defined as func- 
tions of all interatomic distances r aT and scaled variables 
r aT = f{r aT )- Here, / maps the interatomic distances 
monotonically onto the interval (—1,1) such that most of 
the variation occurs where the wavefunction differs most 
from zero; the explicit form of / is not important for the 
current discussion. 

The elementary basis functions used for energy level 
k have non- linear variational parameters dj , and are of 
the form 



Si (i?)exp [5>? )fl i( fl )-E 



<7<T 



Kkr a 




(7) 



a, r, v = 1, . . . , N c are atom indices. The polynomial s, is 
characterized by three non- negative integral powers nu: 



E 

<T<T<V I 



tl(r l ar + f l TV + f l va ) n ". 



(8) 



The prefactor polynomial Sj has bosonic symmetry, and 
contains general three-body correlations, since all poly- 
nomials symmetric in x, y, and z can be written as poly- 
nomials in the three invariants /; = x l + y l + z , with 
I = 1,2,3 and v. v. Jl3|] The number of elementary ba- 
sis functions is limited by a bound on the total degree 
^2ilnu; the polynomials Sj in the exponent are of the 
same form as those in the prefactor, and their number in 
Eq. (0) is limited similarly. 

The constant Kk is determined self-consistently so that 
the wavefunction has the correct exponential decay in the 
limit that a single atom goes off to infinity. Assuming -as 
is plausible for the small clusters studied here- that the 
energy of a cluster is roughly proportional to the number 
of atom pairs |l5[ , we find 



-mEu 



N, 



Nr. 



(9) 



The term in Eq. (0) ensures that Ti. Pi / Pi has a weaker 
divergence than with r~^ 2 in the limit r aT — > 0. [fl3f 

States of higher energy are found with the same op- 
timization scheme by using the appropriate eigenvector 
of the matrix £ in Eq. (§). 

We use the same scaling 

function / for all states, but different non-linear param- 

(k) 

eters and Kk- This scheme works as long as the trial 
functions possess the variational freedom accurately to 
represent the eigenstates. Otherwise, for the Monte Carlo 
samples of the size we are using, states may be skipped 
or spuriously introduced. We found it useful to check 
consistency of eigensystems obtained with this basis set 
and one that includes the variational wavefunctions of 
previously determined, lower-lying states. 

Instead of the hybrid method discussed in this Letter, 
which treats linear and non-linear variational parameters 
differently, we originally attempted to use minimization 
of the variance of the energy for both types of parame- 
ters simultaneously. Although that works for statistical 
mechanical applications, |j| it fails here, except for the 
lowest two or three energy levels. We could only make 
this work (and even then only for small de Boer param- 
eters) for higher levels by first constructing approximate 
wave functions using a conventional approximation, and 
then fitting these functions by the basis functions used 
in this Letter. Finally, the parameter values obtained in 
these fits served as starting values for further optimiza- 
tion, m 

We used the optimized trial wavefunctions as basis 
functions in a correlation function Monte Carlo calcu- 
lation. . Formally, this means that the n elementary 
basis functions Pi are replaced by a small number of func- 
tions exp(— ^t?i)'tp( k \ For this part of the computation, 



the analog of Eq. (Q) is used to compute eigenvalues, 
rather than Eq. (^|). The singular value decomposition 
(SVD) cannot be used for the correlation function Monte 
Carlo because there are too many configurations R a to 
store the required matrices B and B' . However, since the 
optimized basis functions are few in number and roughly 
orthonormal -at least for small projection times t— the 
SVD is not essential in this case. 

Before we present estimates of the excited state ener- 
gies, we discuss the sources of error of this method. ||] In 
addition to the statistical errors, there are two systematic 
errors. For any finite projection time t and in the limit 
of vanishing statistical errors, the energies computed by 
this method are upper bounds to the exact energies. ||] 
In practice, since the statistical errors increase with pro- 
jection time, one should choose the smallest projection 
time such that the projection and statistical errors are 
of the same order of magnitude. To pinpoint that time 
one has to distinguish real trends from false ones due 
to correlated noise. This is always tricky, but a trou- 
blesome detail is that at that point the results tend to 
have a non-Gaussian distribution, JIt]] which makes it 
difficult to produce error bars with a well defined statis- 
tical meaning. In addition, there is the time-step error, 
which arrises because the imaginary-time evolution op- 
erator exp(— tTL) has to be evaluated as the limit r — > 
of [exp(— tTL) + 0(r 2 )] t /' r ', but this error is much easier 
to control. 

Next we present results for excited state energies for 
clusters with up to seven atoms. First, we computed en- 
ergies for trimers of Ne, Ar, Kr, and Xe (to = 7.092 x 
1CT 3 , 6.9635 x 1(T 4 , 1.9128 x 1CT 4 , and 7.8508 x 10~ 5 ). 
Since our variational functions contain general three- 
body correlations, the accuracy of the wavefunctions and 
energies for the trimers can be improved without appar- 
ent limit other than the machine precision. During op- 
timization of the wavefunctions for the trimers, we typ- 
ically start with the ground state wavefunction which 
has prefactor degree of 5 or 6. For the trimers we chose 
not to vary the polynomial coefficients in the exponent 
and simply used the fixed terms required by the bound- 
ary conditions. The quality of the wavefunctions may be 
improved by varying polynomial coefficients in the ex- 
ponent, and for larger clusters it becomes important to 
include such polynomials. 

TABLE I. Energy levels Ek of the rare gas trimers; the 
errors are estimated to be a few units in the least significant 



di 


git. 








k 


Ne 3 


Ar 3 


Kr 3 


Xe 3 


1 


-1.719 560 


-2.553 289 43 


-2.760 555 34 


-2.845 241 50 


2 


-1.222 83 


-2.250 185 5 


-2.581 239 


-2.724 955 8 


3 


-1.142 


-2.126 361 


-2.506 946 8 


-2.675 064 8 


4 


-1.038 


-1.996 43 


-2.412 444 


-2.608 615 


5 


-0.890 


-1.946 7 


-2.387 973 


-2.592 226 



For the optimization we used samples consisting of 
4000 configurations, and we gradually increased the pref- 
actor degree to improve the quality of the trial functions. 
For Ne trimers we performed diffusion Monte Carlo fjsf] 
calculations using optimized wavefunctions with prefac- 
tor degrees up to 14. Although, in priciple, for trimers 
nothing should preclude further improvement, the ob- 
served changes were statistically insignificant in the 12 
to 14 degree range. Table | contains results for degree 
12. 

There was no statistically significant difference be- 
tween time steps r =0.4, 0.2 and 0.1, and thus no no- 
ticeable time-step error. In the diffusion Monte Carlo 
calculations we used 1.3 million Monte Carlo steps. For 
Ar, Kr, and Xe trimers we found that the quality of the 
wavefunctions does not improve beyond the prefactor de- 
gree 10. The results in Table Q for the three more massive 
noble gas atoms were obtained using trial wavefunctions 
with such polynomials. Convergence with respect to the 
time-step was established by comparing r =0.8, 0.6, and 
0.4. The number of Monte Carlo steps is the same as for 
Ne. Except for the energy of the fifth level of Ne, which 
is 0.009 too high, the results in Table | agree with, and 
in some cases improve upon, those of Leitner et al. [fl5| . 

In Table [n] we present results for the energies of the 
first five levels of Ar clusters of sizes 4 through 7. Our 
method allows one to go beyond 7 atom clusters, but as 
one can see from Table || the statistical error increase 
with system size. To obtain more accurate results for 
larger clusters it would probably be helpful to include 
higher order correlations in the wavefunction. In the cal- 
culations for 4 through 7 atom clusters we used a 10 de- 
gree prefactor and an exponent of degree three. Again, 
1.3 million step diffusion Monte Carlo results were com- 
pared for r =0.8, 0.6, and 0.4. 

As to the performance of our method as the mass m 
decreases and the atoms become more weakly bound, we 
find that both the optimization and the projection meth- 
ods begin to fail, because the elementary basis functions 
lack the required variational freedom. This breakdown is 
illustrated in Fig. [j], which contains three energy levels 
as a function of mass for a four atom cluster. The results 
are plotted using variables chosen so that there is linear 
dependence both 



TABLE II. Energy levels of Ar clusters of up to seven 
atoms; the errors are estimated to be a few units in the least 
significant digit. 



k 


Ar 4 


Ar 5 


Ar 6 


Ar 7 


1 


-5.118 11 


-7.785 1 


-10.887 9 


-14.191 


2 


-4.785 


-7.567 


-10.561 


-13.969 


3 


-4.674 


-7.501 


-10.51 


-13.80 


4 


-4.530 


-7.39 


-10.46 


-13.74 


5 


-4.39 


-7.36 


-10.35 


-13.71 



for large masses and for energies close to zero. [Q As the 
energy of the levels approaches zero, the scatter in the 
data points increases, and ultimately the method fails to 
produce reliable results. Again, the use of trial wave- 
functions with four-body correlations is likely to make it 
possible to continue to smaller masses. 
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FIG. 1. —\/—Ek of lowest three levels (k — 1, 2, 3) for four 
atom clusters vs m~ 3 . The estimated errors for most energies 
are smaller, than the plot symbols, and increase for decreasing 
mass. Missing data points indicate that no reliable estimates 
were obtained. The vertical arrows indicate Kr, Ar, Ne, and 
He; the horizontal arrow indicates the classical value -y6. 

As we mentioned before, exponential scaling with di- 
mensionality limits the applicability of the DVR method. 
We can only speculate how the Monte Carlo method dis- 
cussed here scales, since the accuracy of the results is de- 
termined mostly by the elusive quality of the basis func- 
tions. It is precisely the degradation of the trial functions 
which is responsible for the big differences in accuracy 
among the results we presented. Clearly, more highly 
excited states have more structure, but the harmonic ap- 
proximation suggests that the corresponding increase in 
complexity scales with a low power of the excitation level. 
It is plausible that it suffices to include in the elementary 
basis functions n-body correlations upto some finite order 
n only, which suggest polynomial complexity for the com- 
putation of the basis functions. Much evidence suggests 
that the projection stage of the calculations also scales 
with a small power of system size. We performed the 
projection part of the calculations by a variant of pure- 
diffusion Monte Carlo |R],^| ■ The statistical noise of this 
approach for large systems increases exponentially, but 
there are alternatives to avoid this, and there it is likely 
that the usual power law behavior of diffusion Monte 
Carlo methods [2(J can be recovered. Let it suffice to 
say that the longer runs typically took a few hours on 
a four processor SGI Origin 200, to produce the results 
presented here. 
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